Abstract

An involution is a permutation that is its own inverse. Given a permutation $\sigma$ of $[n]$, let $N_n(\sigma)$ denote the number of ways to write $\sigma$ as a product of two involutions of $[n]$. If we endow the symmetric groups $S_n$ with uniform probability measures, then the random variables $N_n$ are asymptotically lognormal.

The proof is based upon the observation that, for most permutations $\sigma$, $N_n(\sigma)$ can be well approximated by $B_n(\sigma)$, the product of the cycle lengths of $\sigma$. Asymptotic lognormality of $N_n$ can therefore be deduced from Erdős and Turán's theorem that $B_n$ is itself asymptotically lognormal.

1 Introduction

An involution is a permutation that is its own inverse, i.e. a permutation whose cycle lengths are all less than or equal to two. If $\sigma$ is a permutation of $[n]$, let $N_n(\sigma)$ be the number of ordered pairs of involutions $\tau_1, \tau_2$ of $[n]$ such that $\sigma = \tau_2 \circ \tau_1$. The goal of this paper is to determine the asymptotic distribution of the random variable $N_n$ for uniform random permutations $\sigma$.

Let $\mathcal{T}_n$ be the set of all involutions of $[n]$. The cardinalities $|\mathcal{T}_n|$, $n = 1, 2, 3, \ldots$, have been extensively investigated and form OEIS Sequence A000085 [25]. See also Amdeberhan and Moll [1] for more recent work. Of particular importance for this paper is an asymptotic formula that was derived by Chowla, Herstein, and Moore [8]:

$$|\mathcal{T}_n| \sim \frac{1}{\sqrt{2}\,e^{1/4}} \left(\frac{n}{e}\right)^{n/2} e^{\sqrt{n}}. \tag{1.1}$$

Related approximations appear in Moser and Wyman [12], [20].

Vivaldi and Roberts [22] studied the random permutations that are obtained by multiplying random involutions with various restrictions on their fixed points. However, the product of two uniformly random involutions is not a uniformly random permutation. For example, the identity permutation is generated with probability $1/|\mathcal{T}_n|$, which is much larger than $1/n!$. Thus $N_n$ is clearly not constant.

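Let $\mathbf{1}_{\tau_1,\tau_2}(\sigma) = 1$ if $\tau_2 \circ \tau_1 = \sigma$ (and $\mathbf{1}_{\tau_1,\tau_2}(\sigma) = 0$ otherwise), so that

$$N_n(\sigma) = \sum_{\tau_1, \tau_2 \in \mathcal{T}_n} \mathbf{1}_{\tau_1,\tau_2}(\sigma). \tag{1.2}$$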


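Using this representation and Stirling's formula, it is straightforward to estimate the average number of factorizations [17]:

$$E_n(N_n) = \frac{1}{n!} \sum_{\tau_1, \tau_2} \sum_{\sigma} \mathbf{1}_{\tau_1,\tau_2}(\sigma) = \frac{|\mathcal{T}_n|^2}{n!} \sim \frac{e^{2\sqrt{n}}}{\sqrt{8 \pi e n}}. \tag{1.3}$$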
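As an informal numerical check of (1.1) and (1.3), the following Python sketch computes the involution numbers from the standard recurrence $|\mathcal{T}_n| = |\mathcal{T}_{n-1}| + (n-1)|\mathcal{T}_{n-2}|$ and compares logarithms with the two asymptotic formulas. The function names are illustrative choices and are not taken from the paper.

```python
import math


def involution_counts(n_max):
    """Return [T_0, ..., T_{n_max}], where T_n is the number of involutions of [n]."""
    counts = [1, 1]
    for n in range(2, n_max + 1):
        # The point n is either fixed or swapped with one of the other n - 1 points.
        counts.append(counts[n - 1] + (n - 1) * counts[n - 2])
    return counts[: n_max + 1]


def log_chs_approximation(n):
    """Logarithm of the right-hand side of (1.1)."""
    return 0.5 * n * (math.log(n) - 1) + math.sqrt(n) - 0.5 * math.log(2) - 0.25


def average_factorizations(n):
    """Exact average E_n(N_n) = |T_n|^2 / n! (as a float; fine for moderate n)."""
    t_n = involution_counts(n)[n]
    return t_n * t_n / math.factorial(n)


def log_average_approximation(n):
    """Logarithm of the right-hand side of (1.3)."""
    return 2 * math.sqrt(n) - 0.5 * math.log(8 * math.pi * math.e * n)


if __name__ == "__main__":
    for n in (10, 100, 400):
        exact = math.log(involution_counts(n)[n])
        print(n, exact - log_chs_approximation(n),
              math.log(average_factorizations(n)) - log_average_approximation(n))
```

The printed differences of logarithms should shrink slowly as $n$ grows, which is the behaviour one expects from an asymptotic equivalence.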


Our results show that the average in (1.3) is misleadingly large; if $n$ is large, then for most permutations $\sigma \in S_n$ one has

$$e^{(\frac{1}{2} - \epsilon) \log^2 n} < N_n(\sigma) < e^{(\frac{1}{2} + \epsilon) \log^2 n}.$$


Another consequence of the sum of indicators representation (1.2) is that $\max_{\sigma} N_n(\sigma) = |\mathcal{T}_n|$. The unique permutation that attains the maximum is the identity permutation that fixes all $n$ points. At the other extreme, for $n > 2$, $\min_{\sigma} N_n(\sigma) = n - 1$. The minimum is attained only by the permutations that have a cycle of length $n - 1$. These two extremal results are stated on page 161 of Lugo's thesis [17] and are also proved later in [7]. Lugo also conjectured, but did not prove, that $N_n$ is asymptotically lognormal.

There is an extensive literature on formulas for the number of ways to write a permutation as the product of two or more permutations with various restrictions on the conjugacy classes of the factors of the product. Without trying to review that literature, we refer readers to [13], [HI as possible starting points. For asymptotic problems, even an explicit formula can be quite useless if it is too complicated. However, as the authors in [13] and [H] point out, formulas with non-negative terms tend to be more tractable. In this paper, we make use of one such formula:


$$N_n(\sigma) = \prod_{k=1}^{n} \sum_{j=0}^{\lfloor c_k/2 \rfloor} \frac{k^{c_k - j}\, c_k!}{2^j j!\,(c_k - 2j)!}, \tag{1.4}$$


where $c_k = c_k(\sigma)$ denotes the number of cycles of length $k$ that $\sigma$ has. As far as we know, the first complete proofs of (1.4) are in Petersen and Tenner [21] and Lugo [17].
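For small $n$, formula (1.4) can be compared against a direct enumeration of all ordered pairs of involutions. The sketch below is only a sanity check under our own conventions (permutations of $\{0, \ldots, n-1\}$ stored as tuples of images); the helper names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def cycle_type(perm):
    """Return a Counter {k: c_k} for a permutation given as a tuple of images."""
    seen, counts = set(), Counter()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        counts[length] += 1
    return counts


def count_by_formula(perm):
    """Number of involution factorizations of perm according to (1.4)."""
    total = 1
    for k, c in cycle_type(perm).items():
        total *= sum(
            factorial(c) * k ** (c - j) // (2 ** j * factorial(j) * factorial(c - 2 * j))
            for j in range(c // 2 + 1)
        )
    return total


def count_by_brute_force(n):
    """Tally tau2 o tau1 over all ordered pairs of involutions of {0, ..., n-1}."""
    involutions = [p for p in permutations(range(n))
                   if all(p[p[i]] == i for i in range(n))]
    tally = Counter()
    for t1 in involutions:
        for t2 in involutions:
            tally[tuple(t2[t1[i]] for i in range(n))] += 1
    return tally


if __name__ == "__main__":
    tally = count_by_brute_force(5)
    print(all(count_by_formula(p) == tally[p] for p in permutations(range(5))))
    print(min(tally.values()), max(tally.values()))
```

The last line also illustrates the two extremal values discussed above: the minimum $n - 1$ and the maximum $|\mathcal{T}_n|$.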

We use the formula (1.4) to prove that, for most permutations $\sigma$, $N_n(\sigma)$ can be well approximated by $B_n(\sigma) = \prod_{k=1}^{n} k^{c_k}$, the product of the cycle lengths of $\sigma$. The random variable $B_n$ has been studied by many authors, beginning with the work of Erdős and Turán [10] . m- Asymptotic lognormality of $N_n$ will be deduced from the known fact that $B_n$ is asymptotically lognormal.


2 Factorizations 

This section is more or less expository: we discuss the known factorization (1.4). For each integer $x$, let $\overline{x} = x - n\lfloor x/n \rfloor$ denote the integer remainder when $x$ is divided by $n$. (The positive integer $n$ will be clear from context.) Yang, Ellis, Mamakani, and Ruskey [28] proved the following lemma.

Lemma 2.1. There are exactly $n$ ways to factor the $n$-cycle $\sigma = (0, 1, \ldots, n-1)$ as the product of two involutions of $\{0, 1, 2, \ldots, n-1\}$. The $n$ factorizations are $\sigma = I_k \circ I_{k-1}$, $1 \le k \le n$, where $I_k(x) = \overline{k - x}$ is the integer remainder when $k - x$ is divided by $n$.
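The factorizations in lemma 2.1 are easy to check by machine. The following sketch builds the maps $x \mapsto \overline{k - x}$, verifies that consecutive ones compose to the $n$-cycle, and, for small $n$, confirms by exhaustive search that there are no other factorizations. The function names are our own illustrative choices.

```python
from itertools import permutations


def reflection(k, n):
    """I_k as a tuple of images: x -> (k - x) mod n."""
    return tuple((k - x) % n for x in range(n))


def compose(f, g):
    """Return f o g, i.e. apply g first and then f."""
    return tuple(f[g[x]] for x in range(len(g)))


def n_cycle_factorizations(n):
    """The n pairs (I_k, I_{k-1}) for 1 <= k <= n."""
    return [(reflection(k, n), reflection(k - 1, n)) for k in range(1, n + 1)]


def brute_force_count(n):
    """Count all ordered pairs of involutions whose product is (0, 1, ..., n-1)."""
    cycle = tuple((x + 1) % n for x in range(n))
    involutions = [p for p in permutations(range(n))
                   if all(p[p[x]] == x for x in range(n))]
    return sum(1 for a in involutions for b in involutions if compose(a, b) == cycle)


if __name__ == "__main__":
    n = 7
    cycle = tuple((x + 1) % n for x in range(n))
    print(all(compose(a, b) == cycle for a, b in n_cycle_factorizations(n)))
    print(brute_force_count(5))
```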

Our notational preference for modular arithmetic is influenced by page 158 of [12], where the setting is different but the factorization is similar. In [28], the proof of lemma 2.1 is quite short, elementary, and easy to read. As we show in lemma 2.4 below, the proof of lemma 2.1 can be adapted to the product of two $m$-cycles, and therefore can be used as the basis for an alternative proof of (1.4). Corresponding lemmas appear in [17] and [21], but the derivations there are based on a graph theoretical insight and appear to be different from the proof that is presented here.

For any permutation $\sigma$, we can apply lemma 2.1 separately to each of the cycles of $\sigma$. Therefore a consequence of lemma 2.1 is that the product of the cycle lengths is a lower bound:

$$N_n(\sigma) \ge B_n(\sigma). \tag{2.1}$$

This inequality is not sharp because, in the factorization $\sigma = \tau_2 \circ \tau_1$, there is no requirement that the cycles of $\sigma$ are invariant under the involutions $\tau_1$ and $\tau_2$. For example, we can write $\sigma = (1, 2, 3)(4, 5, 6)$ as $\tau_2 \circ \tau_1$, where $\tau_2 = (1, 4)(2, 6)(3, 5)$ and $\tau_1 = (1, 6)(2, 5)(3, 4)$. Both involutions "exchange" the elements of $\{1, 2, 3\}$ with those of $\{4, 5, 6\}$. The next lemma asserts that there are no other possibilities.

Lemma 2.2. Suppose $O$ is the set of points on a cycle of $\sigma$, and that $\sigma = \tau_2 \circ \tau_1$ is a factorization of $\sigma$ into two involutions. Then $\tau_1(O) = \tau_2(O)$, and $\tau_1(O)$ is the set of points on a cycle of $\sigma$ of length $|O|$.

Proof. Because each $\tau_i$ is a bijection, it is clear that $|\tau_1(O)| = |\tau_2(O)| = |O|$.

Suppose $y_1, y_2$ are points in $\tau_1(O)$. We need to verify that $y_1$ and $y_2$ are on the same cycle of $\sigma$. Let $x_1, x_2$ be their preimages in $O$: $\tau_1(x_i) = y_i$, $i = 1, 2$. Because $x_1$ and $x_2$ are on the same cycle $O$, we have $x_2 = \sigma^{\ell}(x_1)$ for some $\ell$. But then $y_2 = \tau_1(\sigma^{\ell}(x_1)) = \tau_1 \circ (\tau_2 \circ \tau_1)^{\ell}(x_1) = (\tau_1 \circ \tau_2)^{\ell} \circ \tau_1(x_1) = \sigma^{-\ell}(y_1)$. Thus $y_1$ and $y_2$ are on the same cycle, and $\tau_1(O)$ is a single cycle of length $|O|$.

Finally, note that $\tau_2 = \sigma \circ \tau_1$. If $x \in O$, then the set of points on the cycle of $\sigma$ that contains $\tau_2(x)$ is $\{y : y = \sigma^{t} \circ \tau_2(x) \text{ for some } t \in \mathbb{Z}\} = \{y : y = \sigma^{t+1} \circ \tau_1(x) \text{ for some } t \in \mathbb{Z}\}$, and the latter set is the set of points on the cycle of $\sigma$ that contains $\tau_1(x)$. This proves that $\tau_1(O) = \tau_2(O)$; the two involutions both map $O$ to the same cycle. □

Definition 2.3. Let $O_1$ and $O_2$ be two distinct sets of points on cycles of $\sigma$. Two involutions $\tau_1, \tau_2$ exchange $O_1$ and $O_2$ provided that $\sigma = \tau_2 \circ \tau_1$ and $\tau_1(O_1) = \tau_2(O_1) = O_2$.

Lemma 2.4. If $\sigma = (0, 1, 2, \ldots, n-1)(n, n+1, n+2, \ldots, 2n-1)$, then there are precisely $n$ ways to write $\sigma$ as a product of two involutions of $\{0, 1, \ldots, 2n-1\}$ that exchange the two cycles of $\sigma$.

Example: If $n = 5$, then one of the five factorizations is $(0, 1, 2, 3, 4)(5, 6, 7, 8, 9) = J_3 \circ J_2$, where $J_3 = (0, 8)(1, 7)(2, 6)(3, 5)(4, 9)$ and $J_2 = (0, 7)(1, 6)(2, 5)(3, 9)(4, 8)$.

Proof. Let $X = \{0, 1, \ldots, 2n-1\}$. For integral $k$, define $J_k$ to be the involution whose $n$ transpositions are $(x, n + \overline{k - x})$, $x = 0, 1, 2, \ldots, n-1$. Note that $J_k(x) = J_{k \pm n}(x)$, so we are free to calculate the index $k$ modulo $n$. Also note that if $y = n + \overline{k - x}$, then $J_k(y) = x$. Hence it is straightforward to verify that, for any integer $k$, $\sigma = J_k \circ J_{k-1}$. Since there are $n$ choices for $k$, this proves that there are at least $n$ of the factorizations.

Now suppose $\sigma = S \circ T$ for some involutions $S$ and $T$ on $X$, and suppose $S$ and $T$ exchange the two cycles of $\sigma$. Because $S$ exchanges the cycles of $\sigma$, there must be some $k$ for which $S(0) = n + k$. To prove the lemma, it suffices to prove that $S = J_k$ and $T = J_{k-1}$. We use induction to show that, for $0 \le i < n$, $S(i) = n + \overline{k - i}$ and $T(i) = n + \overline{k - 1 - i}$.

For the base case $i = 0$, we already have $S(0) = n + k$. Note that $T(n + \overline{k - 1}) = S^2 \circ T(n + \overline{k - 1}) = S \circ \sigma(n + \overline{k - 1}) = S(n + k) = 0$. Therefore $T(0) = n + \overline{k - 1}$. This completes the base case $i = 0$.

Now let $0 < i \le n - 1$, and assume the inductive hypothesis. Since $i = \sigma(i - 1) = ST(i - 1)$, we have

$$S(i) = S^2 \circ T(i - 1) = T(i - 1) = n + \overline{k - 1 - (i - 1)} = n + \overline{k - i},$$

where the third equality is the inductive hypothesis. Similarly

$$T(n + \overline{k - i - 1}) = S^2 \circ T(n + \overline{k - i - 1}) = S \circ \sigma(n + \overline{k - i - 1}) = S(n + \overline{k - i}) = i.$$

Therefore

$$T(i) = n + \overline{k - 1 - i}.$$

□
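The exchanging involutions $J_k$ of lemma 2.4 can also be checked directly. The sketch below, with function names of our own choosing, reproduces the example $J_3$ for $n = 5$, verifies $\sigma = J_k \circ J_{k-1}$ for every $k$, and for $n = 3$ counts the exchanging factorizations by exhaustive search.

```python
from itertools import permutations


def compose(f, g):
    """Return f o g, i.e. apply g first and then f."""
    return tuple(f[g[x]] for x in range(len(g)))


def exchange_involution(k, n):
    """J_k on {0, ..., 2n-1}: swaps x with n + ((k - x) mod n) for 0 <= x < n."""
    image = [None] * (2 * n)
    for x in range(n):
        y = n + (k - x) % n
        image[x], image[y] = y, x
    return tuple(image)


def two_cycles(n):
    """sigma = (0, ..., n-1)(n, ..., 2n-1) as a tuple of images."""
    first = [(x + 1) % n for x in range(n)]
    second = [n + (x + 1) % n for x in range(n)]
    return tuple(first + second)


def count_exchanging_factorizations(n):
    """Exhaustively count factorizations sigma = S o T that exchange the two cycles."""
    sigma = two_cycles(n)
    involutions = [p for p in permutations(range(2 * n))
                   if all(p[p[x]] == x for x in range(2 * n))]
    return sum(
        1
        for s in involutions
        for t in involutions
        # T exchanges the cycles exactly when it sends {0..n-1} into {n..2n-1}.
        if compose(s, t) == sigma and all(t[x] >= n for x in range(n))
    )


if __name__ == "__main__":
    print(exchange_involution(3, 5))
    print(all(compose(exchange_involution(k, 5), exchange_involution(k - 1, 5))
              == two_cycles(5) for k in range(5)))
    print(count_exchanging_factorizations(3))
```

The exhaustive count is only practical for very small $n$, since it runs over all pairs of involutions of $2n$ points.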


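For non-negative integers $m$ and $k$ define

$$V_m(k) = \sum_{j=0}^{\lfloor m/2 \rfloor} \frac{k^{m-j}\, m!}{2^j j!\,(m - 2j)!} = (-i\sqrt{k})^m\, He_m(i\sqrt{k}), \tag{2.2}$$

where $i^2 = -1$ and $He_m$ is the "probabilists' Hermite polynomial" $He_m(x) = m! \sum_{r=0}^{\lfloor m/2 \rfloor} \frac{(-1)^r x^{m-2r}}{2^r r!\,(m - 2r)!}$. We thank Victor Moll for pointing out this connection with the Hermite polynomials. A less general version appears as equation 2 of Moser and Wyman [19].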
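One way to make the Hermite connection concrete, stated here as our own reading rather than as the paper's exact formula, is the identity $V_m(k) = (-i\sqrt{k})^m He_m(i\sqrt{k})$: the coefficient of $x^{m-2j}$ in $He_m$ has sign $(-1)^j$, and $V_m(k)$ is obtained by dropping the sign and weighting by $k^{m-j}$. The sketch below checks this in exact integer arithmetic, generating $He_m$ from the recurrence $He_{r+1}(x) = x\,He_r(x) - r\,He_{r-1}(x)$. Function names are illustrative.

```python
from math import factorial


def hermite_coefficients(m):
    """Coefficients [a_0, ..., a_m] of the probabilists' Hermite polynomial He_m."""
    prev, cur = [1], [0, 1]  # He_0 = 1 and He_1 = x
    if m == 0:
        return prev
    for r in range(1, m):
        nxt = [0] + cur  # multiply He_r by x
        for idx, coeff in enumerate(prev):
            nxt[idx] -= r * coeff  # subtract r * He_{r-1}
        prev, cur = cur, nxt
    return cur


def v_direct(m, k):
    """V_m(k) computed from the defining sum."""
    return sum(
        factorial(m) * k ** (m - j) // (2 ** j * factorial(j) * factorial(m - 2 * j))
        for j in range(m // 2 + 1)
    )


def v_from_hermite(m, k):
    """V_m(k) read off from the coefficients of He_m."""
    coeffs = hermite_coefficients(m)
    return sum(abs(coeffs[m - 2 * j]) * k ** (m - j) for j in range(m // 2 + 1))


if __name__ == "__main__":
    print(hermite_coefficients(4))
    print([v_direct(m, 1) for m in range(8)])
```

With $k = 1$ the values $V_m(1)$ are the involution numbers, which gives a quick consistency check against A000085.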


Theorem 2.5. (Lugo, Petersen, Tenner) If $c_k(\sigma)$ denotes the number of $k$-cycles that $\sigma \in S_n$ has, then

$$N_n(\sigma) = \prod_{k=1}^{n} V_{c_k}(k).$$

Proof. By lemma 2.2 any involution factorization of $\sigma$ exchanges some number of pairs of cycles of the same size, and leaves the rest fixed. For each $j \le \lfloor c_k/2 \rfloor$, there are precisely $\frac{c_k!}{2^j j!\,(c_k - 2j)!}$ ways to match $j$ pairs of $k$-cycles for swapping, leaving the remaining $c_k - 2j$ $k$-cycles to be fixed. Once the $j$ pairs have been specified, lemmas 2.1 and 2.4 show that there are $k^{c_k - j}$ ways to factor the $k$-cycles. Hence, the total number of factorizations of $\sigma$ is

$$\prod_{k=1}^{n} \sum_{j=0}^{\lfloor c_k/2 \rfloor} \frac{k^{c_k - j}\, c_k!}{2^j j!\,(c_k - 2j)!} = \prod_{k=1}^{n} V_{c_k}(k).$$

□

3 Approximation by $B_n$

Let $T_n(\sigma)$ be the order of $\sigma$ as an element of the symmetric group, i.e. the least common multiple of the cycle lengths. The asymptotic distribution of $T_n$ was deduced from that of $B_n$. (See equation 14.4 of [10], section 7 of [6], and lemma 2 of [1].) A similar strategy is used in this paper. The goal of this section is to prove that $B_n$ can serve as proxy for $N_n$.

The following deterministic lemma supplies a sufficient condition on $\sigma$ that, when satisfied, imposes a bound on the error of the approximation.
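
Lemma 3.1. Suppose $\xi > 1$ and that, for every integer $k \ge \xi$, we have $c_k(\sigma) \le 1$. Also assume that, for every positive integer $k$, $c_k(\sigma) \le \xi$. Then there is a constant $c > 0$, not dependent on $\sigma$ nor $\xi$, such that $B_n(\sigma) \le N_n(\sigma) \le B_n(\sigma) \cdot (c\,\xi^{\xi})^{\xi}$.

Proof. We already have the lower bound (see equation (2.1)). Observe that $V_0(k) = 1$ and $V_1(k) = k$ for all $k \in [n]$, so a cycle length $k$ with $c_k \le 1$ contributes the same factor $k^{c_k}$ to $N_n(\sigma)$ and to $B_n(\sigma)$. For $2 \le m \le \xi$ and $1 \le k < \xi$, a very crude bound for $V_m(k)$ suffices. For example, by Stirling's formula we see that for $2 \le m \le \xi$,

$$\frac{V_m(k)}{k^m} = m! \sum_{j=0}^{\lfloor m/2 \rfloor} \frac{1}{2^j j!\,(m - 2j)!\,k^j} \le m! \sum_{j=0}^{\lfloor m/2 \rfloor} \frac{1}{j!} \le m!\,e \le c\,m^m,$$

where $c \ge 1$ is a constant independent of $k$ and $m$. By assumption $c_k(\sigma) \le \xi$ for all $k < \xi$. Therefore

$$N_n(\sigma) \le B_n(\sigma) \cdot \prod_{k < \xi,\; c_k \ge 2} c\,c_k^{c_k} \le B_n(\sigma) \cdot (c\,\xi^{\xi})^{\xi}.$$

□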

Clearly $B_n(\sigma)$ is not always a good approximation for $N_n(\sigma)$. For example, if $\sigma$ is the identity permutation with $n$ cycles of length one, then $\log B_n(\sigma) = 0$ and $\log N_n(\sigma) \sim \frac{n}{2} \log n$. There is a tradeoff when applying lemma 3.1. The parameter $\xi = \xi(n)$ must be sufficiently large so that most permutations satisfy the hypotheses. However, the larger $\xi$ is, the cruder the bound. The next two lemmas make this precise.
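To get a feel for the size of the error term, the following sketch computes $\log N_n(\sigma) - \log B_n(\sigma)$ from a cycle type $\{k : c_k\}$ and compares it with a crude upper bound. The bound used here, $\sum_{c_k \ge 2} (\tfrac{1}{2} + \log c_k!)$, is our own simple estimate (it follows from $V_m(k)/k^m \le m!\,\sqrt{e}$) and is not claimed to be the estimate used in the proof of lemma 3.1.

```python
from math import factorial, log


def v(m, k):
    """V_m(k): number of involution factorizations of m disjoint k-cycles."""
    return sum(
        factorial(m) * k ** (m - j) // (2 ** j * factorial(j) * factorial(m - 2 * j))
        for j in range(m // 2 + 1)
    )


def log_ratio(cycle_counts):
    """log N_n(sigma) - log B_n(sigma) for a cycle type given as {k: c_k}."""
    return sum(log(v(c, k)) - c * log(k) for k, c in cycle_counts.items())


def crude_bound(cycle_counts):
    """Upper bound sum of (1/2 + log c_k!) over the cycle lengths with c_k >= 2."""
    return sum(0.5 + log(factorial(c)) for c in cycle_counts.values() if c >= 2)


if __name__ == "__main__":
    # Distinct cycle lengths: N_n and B_n agree exactly.
    print(log_ratio({5: 1, 3: 1}))
    # Identity permutation: log N_n grows like (n/2) log n while log B_n = 0.
    for n in (50, 500):
        print(n, log_ratio({1: n}) / (0.5 * n * log(n)))
```

For the identity permutation the printed ratio approaches 1 only slowly, because the correction to $\frac{n}{2}\log n$ is of order $n$.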

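Lemma 3.2. If $\xi = \xi(n) \to \infty$ as $n \to \infty$, and if $P_n$ is the uniform probability measure on $S_n$, then $P_n(c_k \ge 2 \text{ for some } k \ge \xi) = O(1/\xi)$.

Proof. For any choice of $\xi$, Boole's inequality implies that

$$P_n(c_k \ge 2 \text{ for some } k \ge \xi) \le \sum_{k \ge \xi} P_n(c_k \ge 2) = \sum_{k = \lceil \xi \rceil}^{\lfloor n/2 \rfloor} \left[1 - P_n(c_k = 0) - P_n(c_k = 1)\right]. \tag{3.1}$$

It is well known that the probabilities $P_n(c_k = j)$ can be calculated using the Principle of Inclusion-Exclusion, and that the alternating inequalities yield upper and lower bounds. (See also chapter 5 of Sachkov [21] for the “generatingfunctionological” approach.) Thus

$$P_n(c_k = 0) = \sum_{j=0}^{\lfloor n/k \rfloor} \frac{(-1)^j}{j!\,k^j} \ge 1 - \frac{1}{k} \tag{3.2}$$

and

$$P_n(c_k = 1) = \frac{1}{k} \sum_{j=0}^{\lfloor n/k \rfloor - 1} \frac{(-1)^j}{j!\,k^j} \ge \frac{1}{k}\left(1 - \frac{1}{k}\right). \tag{3.3}$$

Putting (3.2) and (3.3) into (3.1), we get

$$P_n(c_k \ge 2 \text{ for some } k \ge \xi) \le \sum_{k = \lceil \xi \rceil}^{\lfloor n/2 \rfloor} \left[1 - \left(1 - \frac{1}{k}\right) - \frac{1}{k}\left(1 - \frac{1}{k}\right)\right] = \sum_{k = \lceil \xi \rceil}^{\lfloor n/2 \rfloor} \frac{1}{k^2} = O\!\left(\frac{1}{\xi}\right).$$

□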
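The inequalities in this proof can be checked with exact rational arithmetic. The sketch below uses the classical inclusion-exclusion formula $P_n(c_k = m) = \frac{1}{k^m m!} \sum_{j=0}^{\lfloor n/k \rfloor - m} \frac{(-1)^j}{j!\,k^j}$; the function names are our own.

```python
from fractions import Fraction
from math import factorial


def prob_cycle_count(n, k, m):
    """Exact P_n(c_k = m) for a uniform random permutation of [n]."""
    if m * k > n:
        return Fraction(0)
    inner = sum(
        Fraction((-1) ** j, factorial(j) * k ** j) for j in range(n // k - m + 1)
    )
    return inner / (factorial(m) * k ** m)


def prob_at_least_two(n, k):
    """Exact P_n(c_k >= 2)."""
    return 1 - prob_cycle_count(n, k, 0) - prob_cycle_count(n, k, 1)


def union_bound(n, xi):
    """Sum of P_n(c_k >= 2) over xi <= k <= n/2, the right-hand side of (3.1)."""
    return sum(prob_at_least_two(n, k) for k in range(xi, n // 2 + 1))


if __name__ == "__main__":
    n = 200
    for xi in (5, 10, 20):
        print(xi, float(union_bound(n, xi)), 1 / (xi - 1))
```

Each term is at most $1/k^2$, so the sum starting at $k = \xi$ is at most $1/(\xi - 1)$, which is the $O(1/\xi)$ of the lemma.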


The second hypothesis is even more likely to hold. 

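Lemma 3.3. If $\xi = \xi(n) \to \infty$, then $P_n(c_k > \xi \text{ for some } k < \xi) = O\!\left(\frac{\xi}{n} + \frac{\xi}{e^{\xi}}\right)$.

Proof. Let $Z_k$, $1 \le k \le n$, be a sequence of independent Poisson($1/k$) random variables. By theorem 4 of [5], $P_n(c_k \le \xi \text{ for all } k < \xi) = \Pr(Z_k \le \xi \text{ for all } k < \xi) + O(\xi/n)$. Standard estimates using Markov's inequality and moment generating functions show that each $Z_k$ is unlikely to exceed $\xi$:

$$\Pr(Z_k > \xi) = \Pr(e^{Z_k} > e^{\xi}) \le \frac{E(e^{Z_k})}{e^{\xi}} = \frac{e^{(e-1)/k}}{e^{\xi}} \le \frac{e^{e-1}}{e^{\xi}}.$$

Therefore

$$\Pr(Z_k \le \xi \text{ for all } k < \xi) \ge \left(1 - \frac{e^{e-1}}{e^{\xi}}\right)^{\xi} = 1 - O\!\left(\frac{\xi}{e^{\xi}}\right).$$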
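The Markov step can be illustrated numerically. For $Z \sim$ Poisson($\lambda$) the moment generating function gives $E(e^{Z}) = e^{\lambda(e-1)}$, so $\Pr(Z > \xi) \le e^{\lambda(e-1) - \xi}$. The sketch below compares this bound with the exact Poisson tail for $\lambda = 1/k$; names are illustrative.

```python
from math import e, exp, factorial


def poisson_tail(lam, xi):
    """Exact Pr(Z > xi) for Z ~ Poisson(lam) and xi >= 0."""
    cutoff = int(xi)  # Z > xi is the same event as Z >= floor(xi) + 1
    return 1.0 - sum(exp(-lam) * lam ** j / factorial(j) for j in range(cutoff + 1))


def markov_mgf_bound(lam, xi):
    """Markov's inequality applied to e^Z: E(e^Z) / e^xi = exp(lam * (e - 1) - xi)."""
    return exp(lam * (e - 1) - xi)


if __name__ == "__main__":
    for k in (1, 2, 5):
        for xi in (3, 6):
            print(k, xi, poisson_tail(1 / k, xi), markov_mgf_bound(1 / k, xi))
```

The bound is far from tight, but it decays exponentially in $\xi$ uniformly in $k$, which is all the argument needs.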


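4 The Asymptotic Lognormality of $N_n$

It is well known that $B_n$ is asymptotically lognormal.

Lemma 4.1. (Erdős and Turán) For any real number $x$,

$$\lim_{n \to \infty} P_n(\log B_n(\sigma) \le \mu_n + x\sigma_n) = \Phi(x),$$

where $\mu_n = \sum_{k=1}^{n} \frac{\log k}{k} \sim \frac{1}{2} \log^2 n$, $\sigma_n^2 = \sum_{k=1}^{n} \frac{\log^2 k}{k} \sim \frac{1}{3} \log^3 n$, and $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$.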
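The centering and scaling constants are simple to compute. The sketch below assumes the definitions $\mu_n = \sum_{k \le n} (\log k)/k$ and $\sigma_n^2 = \sum_{k \le n} (\log^2 k)/k$ and compares them with the leading terms $\frac{1}{2}\log^2 n$ and $\frac{1}{3}\log^3 n$. The function name is our own.

```python
from math import log


def log_product_moments(n):
    """Return (mu_n, sigma_n^2) as the sums of log(k)/k and log(k)^2/k over 1 <= k <= n."""
    mu = 0.0
    var = 0.0
    for k in range(1, n + 1):
        lk = log(k)
        mu += lk / k
        var += lk * lk / k
    return mu, var


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 5):
        mu, var = log_product_moments(n)
        print(n, mu / (0.5 * log(n) ** 2), var / (log(n) ** 3 / 3))
```

Both ratios are already close to 1 for moderate $n$, although the lognormal limit itself is approached slowly because the scale is logarithmic.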

Remark 4.2. The first proof of lemma 4.1 is in the work of Erdős and Turán [H]. Alternative proofs, as well as stronger and more general results, have been proved using quite varied
techniques. See, for example, 0 , 0 , H, 0, IH. 
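
Theorem 4.3. $P_n(\log N_n(\sigma) \le \mu_n + x\sigma_n) = \Phi(x) + o(1)$.

Proof. Because $N_n(\sigma) \ge B_n(\sigma)$ for all $\sigma \in S_n$, one direction is an immediate consequence of lemma 4.1:

$$P_n(\log N_n \le \mu_n + x\sigma_n) \le P_n(\log B_n \le \mu_n + x\sigma_n) = \Phi(x) + o(1). \tag{4.1}$$

For the other direction, we use the continuity of $\Phi$ and the bound $N_n(\sigma) \le (c\,\xi^{\xi})^{\xi} B_n(\sigma)$ from lemma 3.1 which, due to lemma 3.2 and lemma 3.3, holds with probability $1 - O\!\left(\frac{1}{\xi} + \frac{\xi}{n} + \frac{\xi}{e^{\xi}}\right)$.

In more detail, let $\epsilon > 0$ be a fixed but arbitrarily small positive number. We can choose $\delta > 0$ so that $|\Phi(x) - \Phi(a)| < \epsilon$ whenever $|x - a| < \delta$. If we choose $\xi = \sqrt{\log n}$, then we have $\log((c\,\xi^{\xi})^{\xi}) = o(\sigma_n)$. Therefore we can choose $n_0$ so that, for all $n \ge n_0$, $\log((c\,\xi^{\xi})^{\xi}) < \frac{\delta}{2}\sigma_n$. But then

$$P_n(\log N_n(\sigma) \le \mu_n + x\sigma_n) \ge P_n\!\left(\log B_n(\sigma) + \log((c\,\xi^{\xi})^{\xi}) \le \mu_n + x\sigma_n\right) + o(1) \tag{4.2}$$

$$\ge P_n\!\left(\log B_n(\sigma) + \tfrac{\delta}{2}\sigma_n \le \mu_n + x\sigma_n\right) + o(1) \tag{4.3}$$

$$= P_n\!\left(\log B_n(\sigma) \le \mu_n + \left(x - \tfrac{\delta}{2}\right)\sigma_n\right) + o(1) \tag{4.4}$$

$$= \Phi\!\left(x - \tfrac{\delta}{2}\right) + o(1) \ge \Phi(x) - \epsilon + o(1). \tag{4.5}$$

Here the $o(1)$ term in (4.2) accounts for the permutations that fail the hypotheses of lemma 3.1. Yet $\epsilon > 0$ was arbitrary, and so $P_n(\log N_n(\sigma) \le \mu_n + x\sigma_n) \ge \Phi(x) + o(1)$. □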
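A small Monte Carlo experiment can illustrate the sandwich used in the proof. The sketch below samples the cycle type of a uniform random permutation using the standard fact that the cycle containing a given point has length uniform on $\{1, \ldots, n\}$, then standardizes $\log B_n$ and $\log N_n$ with $\mu_n$ and $\sigma_n$ as defined in lemma 4.1 (read here as the sums of $(\log k)/k$ and $(\log^2 k)/k$). The sample sizes, seed, and function names are arbitrary illustrative choices, and convergence to the normal law is slow on this scale.

```python
import random
from math import factorial, log


def random_cycle_type(n, rng):
    """Cycle counts {k: c_k} of a uniform random permutation of [n]."""
    counts = {}
    remaining = n
    while remaining > 0:
        # The cycle through a fixed remaining point has uniform length on 1..remaining.
        length = rng.randint(1, remaining)
        counts[length] = counts.get(length, 0) + 1
        remaining -= length
    return counts


def log_b(counts):
    """log of the product of the cycle lengths."""
    return sum(c * log(k) for k, c in counts.items())


def log_n(counts):
    """log of the number of involution factorizations, via the product formula."""
    total = 0.0
    for k, c in counts.items():
        v = sum(
            factorial(c) * k ** (c - j) // (2 ** j * factorial(j) * factorial(c - 2 * j))
            for j in range(c // 2 + 1)
        )
        total += log(v)
    return total


def standardized_samples(n, trials, seed=0):
    """Return a list of pairs (standardized log B_n, standardized log N_n)."""
    rng = random.Random(seed)
    mu = sum(log(k) / k for k in range(1, n + 1))
    sd = sum(log(k) ** 2 / k for k in range(1, n + 1)) ** 0.5
    samples = []
    for _ in range(trials):
        counts = random_cycle_type(n, rng)
        samples.append(((log_b(counts) - mu) / sd, (log_n(counts) - mu) / sd))
    return samples


if __name__ == "__main__":
    samples = standardized_samples(10 ** 4, 2000)
    gap = max(zn - zb for zb, zn in samples)
    below_zero = sum(1 for _, zn in samples if zn <= 0) / len(samples)
    print("largest standardized gap:", gap)
    print("fraction with standardized log N_n <= 0:", below_zero)
```

In each sample the standardized value for $N_n$ is at least that for $B_n$, in line with (2.1), and the gap between them is typically small compared with the unit scale.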